The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space

نویسندگان

  • Robert Jenssen
  • Deniz Erdogmus
  • José Carlos Príncipe
  • Torbjørn Eltoft
چکیده

A new distance measure between probability density functions (pdfs) is introduced, which we refer to as the Laplacian pdf distance. The Laplacian pdf distance exhibits a remarkable connection to Mercer kernel based learning theory via the Parzen window technique for density estimation. In a kernel feature space defined by the eigenspectrum of the Laplacian data matrix, this pdf distance is shown to measure the cosine of the angle between cluster mean vectors. The Laplacian data matrix, and hence its eigenspectrum, can be obtained automatically based on the data at hand, by optimal Parzen window selection. We show that the Laplacian pdf distance has an interesting interpretation as a risk function connected to the probability of error.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

یادگیری نیمه نظارتی کرنل مرکب با استفاده از تکنیک‌های یادگیری معیار فاصله

Distance metric has a key role in many machine learning and computer vision algorithms so that choosing an appropriate distance metric has a direct effect on the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...

متن کامل

Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...

متن کامل

Discussion of "Spectral Dimensionality Reduction via Maximum Entropy"

Since the introduction of LLE (Roweis and Saul, 2000) and Isomap (Tenenbaum et al., 2000), a large number of non-linear dimensionality reduction techniques (manifold learners) have been proposed. Many of these non-linear techniques can be viewed as instantiations of Kernel PCA; they employ a cleverly designed kernel matrix that preserves local data structure in the “feature space” (Bengio et al...

متن کامل

On the Existence of Kernel Function for Kernel-Trick of k-Means

This paper corrects the proof of the Theorem 2 from the Gower’s paper [3, page 5]. The correction is needed in order to establish the existence of the kernel function used commonly in the kernel trick e.g. for k-means clustering algorithm, on the grounds of distance matrix. The scope of correction is explained in section 2. 1 The background problem Kernel based k-means clustering algorithm (clu...

متن کامل

A Geometry Preserving Kernel over Riemannian Manifolds

Abstract- Kernel trick and projection to tangent spaces are two choices for linearizing the data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to high dimensional feature space without considering the intrinsic geometry of data points. ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004